A Cost-Effective Statistical Method to Correct for Differential Genotype Misclassification When Performing Case-Control Genetic Association
Abstract
Background/Aims: There is growing interest in the effect of differential misclassification on power and type I error rate in genome-wide association studies. We present an extension of a previously published test statistic: the likelihood ratio test allowing for errors (LRT AE). This test uses double-sample information on a subset of individuals to increase power for genetic association in the presence of nondifferential misclassification. Methods: We extend the original LRT AE by allowing for differential genotype misclassification between case and control populations. We label this new statistic LRT D AE. We test the performance of this statistic with data simulated under differential misclassification specifications and two different types of genetic models: null and power. For simulations using the null model, we specify that there is no difference between case and control genotype frequencies before the introduction of errors. For simulations under power, we consider three modes of inheritance: dominant, multiplicative, and recessive. Results: We show that the LRT D AE, with p values computed using permutation, [...]
Received: March 25, 2009. Accepted after revision: April 27, 2010. Published online: July 3, 2010.
Derek Gordon, PhD, Department of Genetics, 145 Bevier Road, Piscataway, NJ 08854 (USA). Tel. +1 732 445 3386, Fax +1 732 445 1147, E-Mail gordon@biology.rutgers.edu
Hum Hered 2010;70:102–108. © 2010 S. Karger AG, Basel. Accessible online at: www.karger.com/hhe
[...] reported inflation of the false-positive rate for a case-control study of type I diabetes in the United Kingdom. The authors proposed a weighting scheme for the test statistics to treat this inflation. As noted by these authors, however, the weighting can cause a decrease in power to detect association. Pearce et al. [15] showed that in a previously published association study of prostate cancer [19], the reported finding was a false positive. In the initial study, cases were called much earlier than controls, causing different group error rates. Moskvina et al. [14] performed a simulation study to document the effect of differential misclassification on the probability of false-positive association, showing that unequal case-control error probabilities can cause substantial inflation in the type I error rate, as was indeed observed by Clayton et al. [13] and Pearce et al. [15].
Plagnol et al. [12] proposed a modified version of the SNP clustering algorithm of Moorhead et al. [20] in order to reduce the differential bias in genotype scoring between cases and controls. Since differential misclassification also affects missing genotype calls, the authors adapted association tests to deal with uncertain SNP calls (normally labeled as missing) in order to avoid the extra bias introduced by unequal missing rates. Marquard et al. [16] examined the effect of differential genotype misclassification on the type I error rate of three haplotype-based association methods, and Cheng and Lin [17] evaluated its effect on studies of gene-environment interaction. Finally, Ahn et al. [18] showed that, for fixed differential error frequencies, when the allele frequencies are equal between cases and controls, under Hardy-Weinberg equilibrium the rejection rates of the χ² and linear trend tests increase as the minor allele frequency decreases and the sample size increases.
Differential error rates can occur due to differences in DNA quality or extraction protocols between cases and controls [13] or through the use of public controls. Different rates may also occur when cases and controls are genotyped at different times or when researchers employ different genotype calling technologies.
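The inflation described by Moskvina et al. [14] and Ahn et al. [18] can be illustrated with a small simulation. The following Python sketch is not taken from any of the cited studies: the misclassification matrices, minor allele frequency, and sample sizes are hypothetical values chosen for illustration. It draws case and control genotypes from the same Hardy-Weinberg frequencies (the null model), applies an error matrix to each group, and estimates how often a 2×3 Pearson χ² test rejects at the 5% level.

```python
import math
import random

def hwe_freqs(maf):
    """Genotype frequencies (AA, AB, BB) under Hardy-Weinberg equilibrium; B is the minor allele."""
    p = 1.0 - maf
    return [p * p, 2 * p * maf, maf * maf]

def observed_freqs(true_freqs, err):
    """Apply a 3x3 misclassification matrix, err[i][j] = Pr(observed j | true i)."""
    return [sum(true_freqs[i] * err[i][j] for i in range(3)) for j in range(3)]

def sample_counts(freqs, n, rng):
    """Draw n genotypes and return the counts of AA, AB, BB."""
    counts = [0, 0, 0]
    for g in rng.choices(range(3), weights=freqs, k=n):
        counts[g] += 1
    return counts

def chi2_pvalue_2x3(cases, controls):
    """Pearson chi-square p-value for a 2x3 genotype table.

    With 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    Assumes no genotype column is empty (true with overwhelming probability
    for the settings used here).
    """
    n_case, n_ctrl = sum(cases), sum(controls)
    total = n_case + n_ctrl
    stat = 0.0
    for j in range(3):
        col = cases[j] + controls[j]
        if col == 0:
            continue
        for obs, row in ((cases[j], n_case), (controls[j], n_ctrl)):
            expected = row * col / total
            stat += (obs - expected) ** 2 / expected
    return math.exp(-stat / 2.0)

def rejection_rate(maf, n, err_case, err_ctrl, reps=300, alpha=0.05, seed=1):
    """Fraction of null replicates (equal true frequencies) in which the test rejects."""
    rng = random.Random(seed)
    true = hwe_freqs(maf)
    f_case = observed_freqs(true, err_case)
    f_ctrl = observed_freqs(true, err_ctrl)
    hits = 0
    for _ in range(reps):
        p = chi2_pvalue_2x3(sample_counts(f_case, n, rng), sample_counts(f_ctrl, n, rng))
        hits += p < alpha
    return hits / reps

# Hypothetical error models: controls are error-free, while 5% of true AA
# cases are miscalled as AB.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
CASE_ERR = [[0.95, 0.05, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

if __name__ == "__main__":
    print("no errors:          ", rejection_rate(0.2, 2000, IDENTITY, IDENTITY))
    print("differential errors:", rejection_rate(0.2, 2000, CASE_ERR, IDENTITY))
```

Varying `maf` and `n` in `rejection_rate` gives a rough way to explore how the rejection rate under the null responds to allele frequency and sample size for a fixed pair of error matrices.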
Furthermore, as the use of copy number polymorphisms in association studies increases, so will the impact of differential genotype misclassification [21].
In this article we propose an extension of a previously published test statistic: the likelihood ratio test allowing for errors (LRT AE) [22]. This test uses double-sample information [23, 24] on a subset of individuals to increase power for genetic association in the presence of nondifferential misclassification. By double-sample, we mean that a subset of individuals is genotyped by both the standard genotyping mechanism, which is subject to misclassification, and a gold-standard genotyping mechanism that has much lower misclassification rates. We extend the original LRT AE by allowing for differential genotype misclassification between case and control populations. This extension is accomplished by assuming that an individual's observed genotype is conditionally dependent upon both the individual's true genotype and true phenotype. In this work we do not consider phenotype misclassification. We perform simulations to evaluate the type I error rate and power of our method in the presence of differential genotype misclassification. We use the notation LRT D AE to denote the statistic that allows for differential misclassification.
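To make the double-sample idea concrete, the sketch below estimates phenotype-specific misclassification probabilities, Pr(observed genotype | true genotype, phenotype), from a double-sampled subset. This is only a simple proportion estimate under stated assumptions, not the likelihood ratio statistic itself: the gold-standard call is treated as the true genotype, and the function name, data layout, and toy counts are hypothetical.

```python
from collections import Counter

GENOTYPES = ("AA", "AB", "BB")
PHENOTYPES = ("case", "control")

def estimate_misclassification(double_sampled):
    """Estimate Pr(observed | true, phenotype) from double-sampled individuals.

    double_sampled is an iterable of (phenotype, true_genotype, observed_genotype)
    triples, where true_genotype is the gold-standard call (assumed error-free
    here) and observed_genotype is the standard, error-prone call.
    Returns a dict keyed by (phenotype, true, observed); rows with no
    double-sampled individuals are omitted rather than guessed.
    """
    counts = Counter(double_sampled)
    estimates = {}
    for pheno in PHENOTYPES:
        for true in GENOTYPES:
            row_total = sum(counts[(pheno, true, obs)] for obs in GENOTYPES)
            if row_total == 0:
                continue
            for obs in GENOTYPES:
                estimates[(pheno, true, obs)] = counts[(pheno, true, obs)] / row_total
    return estimates

# Hypothetical toy data: 2 of 20 true-AA cases are miscalled as AB,
# while all 20 true-AA controls are called correctly.
toy_double_sample = (
    [("case", "AA", "AA")] * 18
    + [("case", "AA", "AB")] * 2
    + [("control", "AA", "AA")] * 20
)

if __name__ == "__main__":
    for key, value in sorted(estimate_misclassification(toy_double_sample).items()):
        print(key, round(value, 3))
```

Keeping a separate set of probabilities for cases and controls is what distinguishes a differential error model from a nondifferential one, in which a single matrix would be shared by both groups.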
Similar resources
Statistical adjustment of genotyping error in a case–control study of childhood leukaemia
BACKGROUND: Genotyping has become more cost-effective and less invasive with the use of buccal cell sampling. However, low or fragmented DNA yields from buccal cells collected using FTA cards often require additional whole genome amplification to produce sufficient DNA for genotyping. In our case-control study of childhood leukaemia, discordance was found between genotypes derived from blood an...
A new expectation-maximization statistical test for case-control association studies considering rare variants obtained by high-throughput sequencing.
Genome-wide association studies (GWAS) have been successful in identifying common genetic variation reproducibly associated with disease. However, most associated variants confer very small risk and after meta-analysis of large cohorts a large fraction of expected heritability still remains unexplained. A possible explanation is that rare variants currently undetected by GWAS with SNP arrays co...
Increasing power for tests of genetic association in the presence of phenotype and/or genotype error by use of double-sampling.
Phenotype and/or genotype misclassification can: significantly increase type II error probabilities for genetic case/control association, causing decrease in statistical power; and produce inaccurate estimates of population frequency parameters. We present a method, the likelihood ratio test allowing for errors (LRTae) that incorporates double-sample information for phenotypes and/or genotypes ...